A Distributed Domino-Effect Free Recovery Algorithm
Abstract
Recovery techniques may be distinguished by when the recovery lines are built: either when the recovery point is recorded or at the time of rollback. Accordingly, we distinguish "planned" and "unplanned" policies for determining recovery lines. With an unplanned policy, a "domino effect" can occur. A planned policy is usually understood to be static, in the sense that the recovery lines are established a priori at design time. In this paper, an algorithm for "dynamic" planning of recovery lines is specified. We define a computational model for a distributed system of processes that communicate by asynchronous message passing, and describe the recovery algorithms by means of axioms.
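To make the domino effect under an unplanned policy concrete, the following is a minimal Python sketch. It is not the algorithm of the paper, which is specified by axioms; it is a generic rollback-propagation loop written under stated assumptions. The function name `recovery_line`, the use of a single global integer clock, the convention that a checkpoint taken at time `c` captures exactly the events before `c`, and the sample checkpoint and message times are all hypothetical. Only orphan messages (received but, after rollback, never sent) are considered; messages lost in transit are ignored.

```python
def recovery_line(checkpoints, messages, failed):
    """Find a consistent recovery line at rollback time (unplanned policy).

    checkpoints: dict mapping process -> sorted checkpoint times (0 = initial state).
    messages:    list of (sender, send_time, receiver, recv_time) tuples.
    failed:      the process that fails and restarts from its latest checkpoint.
    Returns a dict mapping process -> restored time (inf means no rollback).
    """
    # Assumption: every process other than the failed one starts un-rolled-back.
    line = {p: float("inf") for p in checkpoints}
    line[failed] = checkpoints[failed][-1]

    changed = True
    while changed:  # repeat until no orphan message remains
        changed = False
        for sender, sent, receiver, received in messages:
            # Orphan message: the send has been undone, the receive survives.
            if sent > line[sender] and received < line[receiver]:
                # Roll the receiver back to its latest checkpoint before the receive.
                line[receiver] = max(c for c in checkpoints[receiver] if c < received)
                changed = True
    return line


# Hypothetical staircase of messages between two processes P and Q.
messages = [
    ("P", 5, "Q", 7),
    ("Q", 20, "P", 25),
    ("P", 40, "Q", 45),
    ("Q", 60, "P", 65),
    ("P", 80, "Q", 85),
]

# Independent checkpoints: each rollback creates a new orphan message.
unplanned = {"P": [0, 30, 70], "Q": [0, 10, 50]}

# Checkpoints that already form a recovery line at time 70.
planned = {"P": [0, 70], "Q": [0, 70]}

print(recovery_line(unplanned, messages, "P"))  # {'P': 0, 'Q': 0}
print(recovery_line(planned, messages, "P"))    # {'P': 70, 'Q': 70}
```

With the independent checkpoints, the failure of P cascades until both processes are back at their initial state, which is the domino effect. With the pre-arranged checkpoints, the rollback stops at the most recent recovery line.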
Similar resources
Efficient Checkpoint-based Failure Recovery Techniques in Mobile Computing Systems
Conventional distributed and domino effect-free failure recovery techniques are inappropriate for mobile computing systems because each mobile host is forced to take a new checkpoint (based on coordinated checkpointing). Otherwise, multiple local checkpoints may need to be stored in stable storage (based on communication-induced checkpointing). Hence, this investigation presents a novel domino ...
Efficient Techniques for Adaptive Independent Checkpointing in Distributed Systems
This work presents two novel algorithms to prevent rollback propagation for independent checkpointing: an efficient adaptive independent checkpointing algorithm and an optimized adaptive independent checkpointing algorithm. The last opportunity strategy that yields a better performance than the conservation strategy is also employed to prevent useless checkpoints for both causal rewinding paths...
Characterization of Consistent Global Checkpoints in Large-Scale Distributed Systems
Backward error recovery is one of the most used schemes to ensure fault-tolerance in distributed systems. It consists, upon the occurrence of a failure, in restoring a distributed computation in an error-free global state from which it can be resumed to produce a correct behaviour. Checkpointing is one of the techniques to pursue the backward error recovery. As we consider large-scale distribut...
A checkpointing-recovery scheme for domino free distributed systems
Many communication induced checkpointing algorithms have been proposed for asynchronous cooperating processes. All of them suffer from overhead due both to the exchange of control information and to the insertion of local checkpoints additional to the basic ones. In this paper we propose a low overhead checkpointing-recovery scheme. It consists of a domino-free checkpointing algorithm plus an a...
Optimistic Crash Recovery without Changing Application Messages
We present an optimistic crash recovery technique without any communication overhead during normal operations of the distributed system. Our technique does not append any information to the application messages, it does not suffer from the domino effect, and each processor rolls back at most once during recovery. We present three distributed rollback algorithms, their complexities, and correctn...